Architecting Fault-Tolerant Software Systems

Author

  • Hasan Sözer

Abstract

The increasing size and complexity of software systems makes it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are subject to faults. Many fault tolerance techniques are available but incorporating them in a system is not always trivial. We consider the following problems in designing a fault-tolerant system. First, existing reliability analysis techniques generally do not prioritize potential failures from the end-user perspective and accordingly do not identify sensitivity points of a system. Second, existing architecture styles are not well-suited for specifying, communicating and analyzing design decisions that are particularly related to the fault-tolerant aspects of a system. Third, there are no adequate analysis techniques that evaluate the impact of fault tolerance techniques on the functional decomposition of software architecture. Fourth, realizing a fault-tolerant design usually requires a substantial development and maintenance effort.

To tackle the first problem, we propose a scenario-based software architecture reliability analysis method, called SARAH, that benefits from mature reliability engineering techniques (i.e. FMEA, FTA) to provide an early reliability analysis of the software architecture design. SARAH evaluates potential failures from the end-user perspective to identify sensitive points of a system without requiring an implementation.

As a new architectural style, we introduce Recovery Style for specifying fault-tolerant aspects of software architecture. Recovery Style is used for communicating and analyzing architectural design decisions and for supporting detailed design with respect to recovery.

As a solution for the third problem, we propose a systematic method for optimizing the decomposition of software architecture for local recovery, which is an effective fault tolerance technique to attain high system availability. To support the method, we have developed an integrated set of tools that employ optimization techniques, state-based analytical models (i.e. CTMCs) and dynamic analysis on the system.
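
The abstract names CTMCs as the state-based analytical model used to assess availability. As a minimal sketch of that idea only (not the tooling described above), the following assumes a hypothetical two-state module that is either up or down, with an illustrative failure rate `lam` and repair rate `mu`, and computes its steady-state availability by solving the balance equations. The function name, the state layout and the rates are all assumptions made for illustration.

```python
from fractions import Fraction


def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for a small irreducible CTMC.

    Q is the generator matrix: Q[i][j] is the transition rate from
    state i to state j, and each diagonal entry makes its row sum to 0.
    """
    n = len(Q)
    # Transpose Q so that the unknown vector pi appears as a column.
    A = [[Fraction(Q[j][i]) for j in range(n)] for i in range(n)]
    b = [Fraction(0)] * n
    # One balance equation is redundant; replace it by sum(pi) = 1.
    A[n - 1] = [Fraction(1)] * n
    b[n - 1] = Fraction(1)
    # Gauss-Jordan elimination; assumes the chain is irreducible,
    # so a nonzero pivot always exists.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


# Hypothetical rates (per hour): one failure per 1000 h, 10 h mean repair.
lam = Fraction(1, 1000)
mu = Fraction(1, 10)

# State 0 = up, state 1 = down.
Q = [[-lam, lam],
     [mu, -mu]]

pi = steady_state(Q)
availability = pi[0]  # long-run fraction of time spent in the up state
print(float(availability))
```

For this two-state case the result agrees with the closed form mu / (lam + mu); the generic solver is shown because a decomposition into several recoverable units would give a chain with more states.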

Related resources

Architecting Fault Tolerant Systems

As building trustworthy (dependable) systems is one of the major challenges faced by software developers, dealing with various threats (such as errors, faults and failures) is becoming one of the main foci of software and system research and development. At the core of ensuring system dependability is the acceptance of the fact that errors always happen in spite of all the efforts to eliminate faul...

Architecting Fault-tolerant Component-based Systems: from requirements to testing

Fault tolerance is one of the most important means to avoid service failure in the presence of faults, so as to guarantee that they will not interrupt the service delivery. Software testing, instead, is one of the major fault removal techniques, realized in order to detect and remove software faults during software development so that they will not be present in the final product. This paper shows how ...

Specification-Driven Prototyping for Architecting Dependability

This paper describes a major part of an architecting methodology developed for safety-critical fault-tolerant software systems. The methodology coverage centers on specification-driven prototyping. This approach to prototyping is seen to be superior to the customary approaches of throwaway and evolutionary prototyping. A still developmental form of representation, higher-level statecharts, provi...

Towards Systematic Design of Adaptive Fault Tolerant Systems

The development of modern distributed software systems poses a significant engineering challenge. The system architecture should exhibit plasticity and a high degree of reconfigurability to enable automated adaptation to continuously changing operating conditions and component failures. Traditional engineering approaches are inefficient to cope with the complexity of such systems to ensure their r...

Workshop on Architecting Dependable Systems

In comparison with the state of the art in the field of Web Services architectures and their composition, we propose to exploit the concept of CA Actions to enable the dependable composition of Web Services. CA Actions introduce a mechanism for structuring fault-tolerant concurrent systems through the generalization of the concepts of atomic actions and transactions, and are adapted to the compo...

Publication date: 2009